A speaker verification backend with robust performance across conditions

Abstract

In this paper, we address the problem of speaker verification in conditions unseen or unknown during development. A standard method for speaker verification consists of extracting speaker embeddings with a deep neural network and processing them through a backend composed of probabilistic linear discriminant analysis (PLDA) and global logistic regression score calibration. This method is known to result in systems that work poorly on conditions different from those used to train the calibration model. We propose to modify the standard backend, introducing an adaptive calibrator that uses duration and other automatically extracted side-information to adapt to the conditions of the inputs. The backend is trained discriminatively to optimize binary cross-entropy. When trained on a number of diverse datasets that are labeled only with respect to speaker, the proposed backend consistently and, in some cases, dramatically improves calibration, compared to the standard PLDA approach, on a number of held-out datasets, some of which are markedly different from the training data. Discrimination performance is also consistently improved. We show that joint training of the PLDA and the adaptive calibrator is essential: the same benefits cannot be achieved when freezing the PLDA and fine-tuning the calibrator. To our knowledge, the results in this paper are the first evidence in the literature that it is possible to develop a speaker verification system with robust out-of-the-box performance on a large variety of conditions.
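As an illustration of the baseline stage named in the abstract, the following is a minimal sketch of global logistic regression score calibration trained to minimize binary cross-entropy. It is not the authors' implementation and does not cover the proposed adaptive calibrator. The function names, the toy scores and labels, the learning rate, and the step count are all assumptions made for this example.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce(scores, labels, a, b):
    """Mean binary cross-entropy of the calibrated scores a * s + b."""
    total = 0.0
    for s, y in zip(scores, labels):
        p = sigmoid(a * s + b)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(scores)


def fit_global_calibration(scores, labels, lr=0.05, steps=2000):
    """Fit one scale a and one offset b for all trials by gradient descent."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of the loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Hypothetical raw backend scores: 1 = same-speaker trial, 0 = different-speaker trial.
scores = [8.0, 6.5, 4.2, 5.9, 3.1, 2.4, 4.8, 1.8]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

a, b = fit_global_calibration(scores, labels)
print("scale:", a, "offset:", b)
print("BCE before:", bce(scores, labels, 1.0, 0.0))
print("BCE after: ", bce(scores, labels, a, b))
```

Here a single scale and offset are shared by every trial, which is what makes the calibration global. In the adaptive calibrator described in the abstract, the calibration would instead depend on duration and other side-information; the sketch makes no attempt to reproduce that part.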

Similar resources

An i-vector backend for speaker verification

We propose a new approach to the problem of uncertainty modeling in text-dependent speaker verification where speaker factors are used as the feature representation. The state-of-the-art backend in this situation consists in using point estimates of speaker factors to model the joint distribution of pairs of enrollment and test feature vectors under the same-speaker hypothesis. We develop a ver...

A Robust Framework for Forensic Speaker Verification

This paper discusses the application of automatic speaker verification systems in forensic casework. A framework for reporting the system outcome is proposed. Specific system requirements to properly cope with forensic idiosyncrasies are analyzed through a series of simulations. Results suggest that the design of a forensic speaker verification system need not necessarily match the settings of curre...

Environment adaptation for robust speaker verification

In speaker verification over public telephone networks, utterances can be obtained from different types of handsets. Different handsets may introduce different degrees of distortion to the speech signals. This paper attempts to combine a handset selector with (1) handset-specific transformations and (2) handset-dependent speaker models to reduce the effect caused by the acoustic distortion. Spe...

Segmental Normalization for Robust Speaker Verification

For the task of speaker verification, similarity measure normalization methods are relevant to cope with variability problems and with data and/or decision fusion problems. The aim of this paper is to suggest a new method of normalization which combines classical world model based normalization techniques with ones based on a posteriori probability. This original method presents the well-known ...

Noise-robust speaker verification using F0 features

This paper proposes a noise-robust speaker verification method augmented by fundamental frequency (F0). The paper first describes a noise-robust F0 extraction method using the Hough transform. Then, it proposes a robust speaker verification method using multi-stream HMMs which fuse the extracted F0 and cepstral features. Experiments are conducted using four connected-digit utterances of Japanese...

Journal

Journal title: Computer Speech & Language

Year: 2022

ISSN: 1095-8363, 0885-2308

DOI: https://doi.org/10.1016/j.csl.2021.101258